I. PURPOSE


IN RECENT years there has been an in-
creasing emphasis on the factor of
motivation as an influence on the per- 
ceptual processes. A perception is not a 
literal copy of the stimulus, and indi- 
vidual differences in perceiving cannot 
be explained solely in terms of the ob- 
jective stimulus pattern, primary con- 
figural tendencies, or idiosyncrasies in 
sensory, neural, or motor capacities. 
Needs, wants, values, and tensions can 
be significant organizing factors in per- 
ception; and it follows that the more 
unstructured, ambiguous, and amor- 
phous the stimulus situation, the greater 


* This study is an abridgment of a dissertation 
in the Department of Psychology submitted to 
the faculty of the Graduate School of Arts and 
Science, New York University, June, 1952, in 
partial fulfillment of the requirements for the 
degree of Doctor of Philosophy. The writer wishes 
to express his gratitude to Professor Thomas N.
Jenkins, whose perspicacious observations gave 
impetus to the exploration of promising leads, 
and to Dr. William D. Glenn, Jr., former director 
of the New York University Testing and Advise- 
ment Center, who very kindly consented to the 
special testing of the World War II veteran and 
nonveteran clients. Heartfelt thanks are likewise 
proffered to Dr. Bernard Locke, chief clinical 
psychologist of the Mental Hygiene Unit of the 
Brooklyn Regional Office of the Veterans Admin-
istration, and to Dr. Richard H. Paynter, chief
clinical psychologist of the Mental Hygiene Unit
of the New York Regional Office of the Veterans
Administration, whose generous cooperation 
made the testing of neurotic outpatients possible. 
Finally, the writer wishes to acknowledge his 
great debt to his wife, whose devotion and warm 
understanding served as a never-ending source 
of inspiration and to whom this study is affec- 
tionately dedicated. 


the likelihood that any given individual 
will project his needs into the situation 
in an effort to give it meaning and form, 
and the greater the individual variation 
in perceptual response that may be an- 
ticipated. 

Although the interest of the clinical 
psychologist and the student of person- 
ality developed some time after the ex- 
perimental investigation of perception in 
the laboratory, the diagnostic and thera- 
peutic possibilities of graphic-motor be- 
havior as revealed in artistic productions 
were recognized by the clinician long 
before such behavior was made the ob- 
ject of systematic experimental investiga- 
tion. Anastasi and Foley (1, 2, 3, 4) in a 
series of articles surveyed the literature 
on artistic behavior in the abnormal, a 
literature in which theory and specula- 
tion are almost as numerous as undis- 
puted fact. Interest in graphic behavior 
has not, of course, been limited to the 
study of art and art forms, since it has 
been recognized in the past two decades 
that virtually all forms of graphic ex- 
pression may have implications for per- 
sonality evaluation and diagnosis. Rele- 
vant in this connection are the ap- 
proaches of Werner Wolff (46, 47, 48, 
49, 50), Saudek (38), Machover (30), 
Mira (31), Sapas (36), Foster (22), and 
Bender (9, 10, 11, 12, 13, 14, 15).

Bender is generally given credit for 
taking the experiments on the reproduc-
tion of visually perceived form, whether
from copy or from memory, out of the
laboratory and into the clinical setting, 
and there is little question that she did
more than any other individual to see 
the diagnostic possibilities of the method 
and popularize its use. Impressed by the 
experimental studies of Wertheimer, 
Köhler, and Koffka in the Gestalt theory
of perception, she published a mono- 
graph (12) in 1938 summarizing her find- 
ings on the alteration of the “Gestalt 
function” in the copying of geometrical 
designs by children and abnormal pa- 
tients. The test itself consists of nine 
geometrical designs which were selected 
from approximately thirty which Wert- 
heimer (44) used in his classical study 
of Gestalt principles. Each design is on 
a separate card, and the subject is asked 
to copy them one at a time. Other re-
ports of the clinical use of the Bender-
Gestalt Test have been published by
Schilder (39, 40), Bender, Curran, and
Schilder (16), Orenstein and Schilder 
(34), Fabian (20), Hutt (27, 28), Wayne, 
Adams, and Rowe (41), Harrower (25), 
David Wechsler (42), Israel Wechsler 
(43), Barkley (7), Billingslea (17), Wolt-
mann (51), and Pascal and Suttell (35).
While Bender was primarily interested 
in the Gestalt test as an instrument for 
the exploration of the Gestalt function, 
with emphasis upon its maturational and 
pathological aspects, Hutt (28) conceived 
of the test as a projective personality 
technique, rich in interpretive signifi- 
cance and psychodynamic implications 
and capable of contributing valuable 
clues to differential diagnosis. Clinical 
psychologists have been quick to follow 
Hutt’s lead rather than Bender’s more 
prosaic approach because they have long 
sought for a projective device less time- 
consuming than the Rorschach or TAT 
which may be employed as a brief sup- 
plement to the longer techniques in 
yielding new or confirmatory personality
data. Moreover, Bender’s monograph is 
of rather limited usefulness to the clini- 
cian because of its emphasis on extreme 
deviations, the relative absence of con- 
trol data, and the failure to support in- 
terpretations with quantitative findings. 
The reader is generally left in the dark
concerning the actual number of cases
upon which she bases her conclusions.
One gets the impression that there is a 
Bender-Gestalt record “typical” of the 
schizophrenic or “typical” of the aphasic 
and is then puzzled when he runs across 
a record in the clinic which fails to fit 
the “classical” picture. But if Bender's 
articles lack specificity, the criticism 
might be leveled against Hutt that he 
offers “too much too soon” and that
claims are made for the test which have 
yet to be validated. Although he states 
that the “clinical syndromes” for the psy- 
choneurotic, the schizophrenic, and the 
brain-injured patients are “substanti- 
ated by the author's research studies,” 
he has not (to the writer’s knowledge) 
published his original data in quantita- 
tive form, by way of reporting either 
validity coefficients or the proportion of
false positives or false negatives to be
anticipated. Furthermore, many of his
“determinants,” or scoring deviations, 
are vaguely defined, resulting in varying 
interpretations of them by clinicians who 
employ the same terms but assign them 
different meanings. 

It may well be that Bender and Hutt 
are capable of drawing inferences from 
the test which provide valuable insights 
into the patient with whom they are 
working, but it may also be that they are 
relying heavily upon minimal cues based 
upon long experience with the test which 
are generally labeled as “clinical intui- 
tion” and which may be less readily com- 
municable to others. The fact of the 




THE BENDER-GESTALT TEST ON NORMAL AND NEUROTIC ADULTS


matter is that the majority of clinical 
psychologists using the test interpret it in 
a more or less rough, subjective, global 
manner, drawing psychodynamic impli- 
cations on the basis of essentially unsys- 
tematized clinical observation. If the 
test is to have real value to the clinician 
and if it is to be given univocal inter- 
pretation by those who use it as a diag- 
nostic or prognostic instrument, both 
Bender's and Hutt’s provocative ideas 
should be stated as hypotheses to be sub- 
jected to rigorous experimental or clini- 
cal validation. 

The need for quantification and vali- 
dation of all projective tests has been 
recognized with increasing frequency in
recent years, despite the objections of 
some clinicians who seem to feel that 
any attempt to apply statistics to pro- 
jective devices would automatically de- 
stroy their usefulness as instruments for 
assessing the “global personality.” Sar- 
gent (37) has presented a strong case 
for quantification, and White (45), Bell 
(8), Cronbach (18), and Murphy (32) 
have expressed themselves in a similar 
vein. With regard to the Bender-Gestalt 
Test, the majority of psychologists using 
the test seem to feel that it is helpful in 
working with organics and psychotics but 
disagree as to its validity with neurotics. 
Bender (12) reports no data on neurotics, 
and Woltmann notes that “very often... 
a clearly established diagnosis of psycho- 
neurosis from the test battery is accom- 
panied by very normal copies of the 
Gestalt figures” (51, p. 348). Billingslea 
(17) and Pascal and Suttell (35) have 
made efforts to quantify the Bender- 
Gestalt, arriving at almost diametrically 
antithetical findings with regard to the 
neuroses. Billingslea’s findings are essen- 
tially negative, while Pascal and Suttell’s 
are strongly positive. 

The general purpose of the present in-
vestigation is to determine whether neu-
rotics and normals can, in fact, be dis- 
tinguished on the basis of their Bender- 
Gestalt records. Most clinical psycholo- 
gists are agreed that psychotics and or- 
ganics show gross deviations on the test 
that are sufficiently bizarre or unusual 
to identify such cases fairly readily with- 
out the necessity of developing an elab- 
orate scoring system, but, as noted above, 
are less certain of the test’s applicability 
to patients suffering from neurotic con- 
ditions. Before inquiring into the validity 
of the psychodynamic interpretations of 
the individual scoring factors as applied 
to neurotics, it would appear that the 
logically prior step would be to ascertain 
whether the scoring factors themselves 
differentiate between neurotic and nor- 
mal subjects. If a sign alleged to reveal 
anxiety or emotional immaturity occurs 
with equal frequency in both groups, one 
might well question the validity of such 
an interpretation. The specific aims of
this investigation, then, are to ascertain 
the discriminating power of the various 
scoring elements, to isolate any which 
exhibit differential validity as regards 
normals and neurotics, and, if possible, 
to combine these discriminating elements 
into a total score which will effectively
separate the contrasting clinical groups. 
To accomplish these aims, it is proposed
to develop an objective scoring system, 
defining the scoring elements opera- 
tionally rather than conceptually or 
interpretively, and to subject the ob- 
served deviations of reproductions from 
stimulus figures to a statistical appraisal 
similar to that used in item-analysis tech- 
niques. It should be emphasized that this 
is a study of empirical validity; the effort 
is not to identify psychological processes 
or psychodynamic correlates, but to de- 
fine the test’s uses and limitations in 
practical terms. 
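
The item-analysis logic proposed above can be sketched in a few lines. The following is a minimal illustration, assuming a 2 x 2 chi-square (one degree of freedom, no continuity correction) as the measure of discrimination; the text does not name the statistic used, and the function name `sign_chi_square` and the counts in the usage note are hypothetical.

```python
def sign_chi_square(neurotic_with, neurotic_n, normal_with, normal_n):
    """Compare how often one scoring sign occurs in two groups.

    Returns (proportion_in_neurotics, proportion_in_normals, chi_square).
    Chi-square is an ASSUMED statistic for illustration; the source only
    says the appraisal resembles item-analysis techniques.
    """
    # Cells of the 2 x 2 table: sign present / sign absent in each group.
    a, b = neurotic_with, neurotic_n - neurotic_with
    c, d = normal_with, normal_n - normal_with
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # Guard against an empty margin (sign shown by everyone or by no one).
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return a / neurotic_n, c / normal_n, chi2
```

With hypothetical counts, `sign_chi_square(54, 108, 57, 285)` compares a sign shown by half of 108 neurotics with one shown by a fifth of 285 controls. A sign occurring with equal frequency in both groups yields a chi-square of zero, which corresponds to the case in which the interpretation of the sign would be questioned.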
II. PROCEDURE


A. ADMINISTRATION OF THE TEST 


Although Bender did not provide any 
standard instructions for administering 
the test, almost all users of the Bender- 
Gestalt have employed the same basic 
method of administration. The stimulus 
cards, each of which contains a test 
figure, are presented one at a time, and 
the subject is asked to copy them while 
keeping the card in full view. Rulers or 
mechanical guides are not permitted, 
but otherwise the subject is free to pro- 
ceed as he chooses. Questions about how 
the figures are to be copied are usually 
answered in a noncommittal manner or 
referred back to the subject for decision. 
There is no time limit, and the repro- 
ductions themselves are not timed. Era- 
sures, crossing out of part of a figure, or 
making more than one attempt to re- 
produce a figure are allowed. 

In the present study three major inno- 
vations in administration of the test were 
introduced. (a) Each reproduction was 
timed. Although all previous investiga-
tors have ignored the time factor except
to observe that extremely long or short 
times should be noted as a qualitative 
datum, the present investigator felt that 
time might be a significant factor in its 
own right and that its inclusion was as 
justifiable in connection with the 
Bender-Gestalt as it has been with word 
association tests and the Rorschach. (b) 
The administration of the initial test 
was followed by a test of immediate re- 
call. As soon as the subject had copied 
all the figures, all stimulus cards were 
removed from sight and he was asked to 
draw the figures from memory. This
modification of the basic procedure was 
introduced on the hypothesis that if any 
differences in test performance are ex- 
hibited by normals and neurotics, they 
will be accentuated in the absence of
an objective stimulus. Numerous stud-
ies (e.g., 21, 23, 24) on memory for form
show pronounced alterations of the re-
productions in normal adults, and it is
conceivable that certain latent tenden- 
cies that are held in check during the 
copying process might, under the pres- 
sure of an unexpected test of incidental 
memory, reveal themselves as distinctive 
of neurotic adults. Moreover, if emo- 
tional blocking is heightened under 
these conditions, one would expect the 
total number of figures recalled by the 
neurotic group to be significantly smal- 
ler than for the control group. (c) An 
immediate retest followed the immediate 
recall. As soon as the subject had com-
pleted his recall of the figures, he was
asked to copy the figures from the mod- 
els in a procedure identical with that 
employed for the initial test. This pro- 
cedure permits a comparison of per- 
formance under conditions in which the 
test materials and procedures are first 
unfamiliar and then familiar and sheds 
light on the question of whether ob- 
served deviations occurring during the 
first administration of the test persist or 
disappear under a second administra- 
tion. Unlike previous studies, such as 
that of Pascal and Suttell (35), in which 
retests were administered to a compara- 
tively small proportion of the total 
group tested merely for purposes of esti- 
mating reliability, the retest was em- 
ployed in the present research as an in- 
tegral part of the testing procedure with 
implications for both validity and re-
liability.

The test was administered to the con- 
trol group by psychologists at the New 
York University Testing and Advise- 
ment Center and to the experimental 
group by clinical psychologists at two 
Veterans Administration mental hygiene 
units in New York City. While the
writer administered a number of the 
tests himself, the great majority of tests 
were given by others on the assumption 
that the results would have more uni- 
versal applicability if obtained under 
conditions approximating those in actual 
clinical practice. This procedure would
also reduce to a minimum any uncon- 
scious bias on the part of the present 
investigator to influence the outcome. 
The instructions were explained and il- 
lustrated by the writer at staff meetings 
of the installations where the test was 
given, and opportunities were provided 
for the giving of practice tests to answer 
any questions that might arise. 


B. SUBJECTS


The neurotic group consisted of 108 
World War II veterans who, at the time 
of testing, were undergoing treatment 
at the mental hygiene service of the 
Veterans Administration. The cases were
drawn from the mental hygiene clinics 
of the New York Regional Office and the 
Brooklyn Regional Office, with the ma-
jority coming from the latter installa-
tion. In-service diagnoses, on the basis
of which these subjects were receiving 
disability pensions, were not used in 
view of the suspicion that many such 
diagnoses were more a matter of practical 
expedience than of scientific accuracy. 
Rather, the nosological classifications 
were predicated upon judgments arrived 
at by one or more staff psychiatrists on 
the basis of case histories, clinical inter- 
views, and the results of an extensive 
battery of diagnostic psychological tests 
administered by staff clinical psychol- 
ogists. The Bender-Gestalt was incorpo- 
rated into the battery for purposes of 
this study so that it would be regarded 
by the subjects as part of the normal 
testing procedure, but in no instance was
the test used to establish the diagnosis.
If there was any doubt in the minds of 
the psychiatrists or psychologists about 
the diagnosis of a given patient, the 
case was not included. Cases with com- 
plicating features such as gunshot 
wounds, ulcers, migraine, asthma, psy- 
chopathic personality, schizoid trends, 
orthopedic disabilities, or neurological 
involvement were automatically ex-
cluded, the effort being to obtain “pure”
neurotic cases in so far as possible. There
was less concern for differential diagnosis 
within the neurotic rubric than for dif- 
ferential diagnosis between the neurotic 
and nonneurotic psychiatric patients in 
view of the unreliability of specific diag- 
nostic categories, which Ash (6) and oth- 
ers have demonstrated. Only native white 
male adult outpatients were included in 
the experimental criterion group. The 
criteria for selecting this group were so 
rigorously adhered to that over three 
years were required to collect the data. 

The control or “normal” group con- 
sisted of 285 World War II veterans 
availing themselves of aptitude testing 
and vocational counseling under the pro- 
visions of Public Law 346, generally re- 
ferred to as the GI Bill of Rights. Only 
native white male nondisabled veterans 
receiving no pension for physical or 
neuropsychiatric disability were in- 
cluded. For purposes of this study, a 
veteran was considered “normal” pro-
vided he had no history of neuropathic
traits or emotional maladjustment, gave 
no evidence of nervous mannerisms or 
emotional disturbance during two hour- 
long interviews or during a series of 
testing sessions ranging in total time 
from three to eleven hours, had not been 
given a neuropsychiatric diagnosis in 
service, and had not subsequently re- 
quested or received psychiatric treatment. 

In order to refine the control group 
still further, the subjects in this group 
were delimited by the application of a
psychometric as well as a clinical cri- 
terion. The psychometric instrument 
used for this purpose was the Minnesota 
Multiphasic Personality Inventory (group 
form), which has been clinically stand- 
ardized and validated. It is believed that 
deliberate misrepresentation of responses, 
a pitfall in employing any personality 
inventory, was reduced to a minimum 
by administering the test in a vocational- 
guidance rather than a personnel-selec- 
tion situation. Since a standard score of 
70 is generally regarded as the dividing 
line between normal and abnormal 
scores on any one of the nine scales, the 
control group was subdivided into three 
subgroups as follows: 

1. Below-70 Group (N = 155). All
MMPI scores fall below 70.

2. 1-above-70 Group (N = 68). Only
one of the MMPI scores equals or ex-
ceeds 70.

3. 2-above-70 Group (N = 62). Two
or more of the MMPI scores equal or
exceed 70.

This breakdown into subgroups permits 
comparisons within the control group, 
as well as between the control group and 
the neurotic subjects, and increases the 
probability that at least one sizable 
portion of the controls is sufficiently 
“normal” to meet the rigorous standards 
of a research study using the method 
of contrasting groups to establish test 
validity. 
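
The subgrouping rule can be stated as a short function. This is a sketch only: the function name `mmpi_subgroup` and the idea of passing the nine scale scores as a list are assumptions made for illustration, while the cutoff of 70 and the "equals or exceeds" comparison follow the definitions given above.

```python
def mmpi_subgroup(scale_scores, cutoff=70):
    """Assign a control subject to a subgroup from MMPI standard scores.

    scale_scores: standard scores on the nine MMPI scales (hypothetical
    input format). A scale counts as elevated when it equals or exceeds
    the cutoff.
    """
    elevated = sum(1 for score in scale_scores if score >= cutoff)
    if elevated == 0:
        return "Below-70"
    if elevated == 1:
        return "1-above-70"
    return "2-above-70"
```

For example, a subject whose nine scores all lie in the fifties falls in the Below-70 Group, while a subject with scores of 70 and 72 on two scales falls in the 2-above-70 Group.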

Along with the MMPI, all control sub- 
jects took the Otis Gamma test so that 
the relationship between Bender-Gestalt 
Test performance and intelligence might
subsequently be explored. Ideally, it
would have been desirable to have ad-
ministered the MMPI and the Otis to the
neurotic subjects as well, but it was not 
feasible to make such additions to the 
regular diagnostic test battery employed 
at the mental hygiene clinics from which 
the experimental subjects were drawn 
because the need to process cases would 
have made the testing time prohibitive. 
In the neurotic group, 84 of the 108 
subjects were, however, given the 
Wechsler-Bellevue, and the relationship 
between Bender-Gestalt Test perform-
ance and scores on this test will be dis-
cussed subsequently.

The mean age of the total control 
group (N = 285) is 24.22 years, while 
that of the total neurotic group (N = 
108) is 29.76 years. The standard devia- 
tions are 4.70 and 5.74, respectively. 
With regard to education, the mean is 
12.47 grades for the controls and 10.74 
for the neurotics, with standard devia-
tions of 1.72 and 2.89, respectively. Of
the normals, 68.7 per cent are single, 
compared with 37.0 per cent of the neu- 
rotic subjects. Of the controls, 69.7 per 
cent are Army veterans, 23.9 per cent 
Navy, 3.9 per cent Marines, and 2.5 per 
cent Coast Guard. For the neurotics, the 
corresponding figures are 84.3, 14.8, 0.9, 
and 0.0 per cent. The three subgroups 
of the total control group are remarka- 
bly similar with regard to age, educa- 
tion, marital status, and branch of ser- 
vice, and it is noteworthy that the ma- 
jority of subjects from both the control 
and neurotic groups are between 20 and 
30 years of age, the modal grade is the
12th, and the modal service affiliation 
is the Army. 


III. THE SCORING SYSTEM


Bender did not develop any system of 
scoring for her test because her concern 
with gross distortions and bizarre devia-
tions of the reproductions from the
original stimuli made such a refinement
unnecessary. In the present research the 
working hypothesis has been adopted 
that subtle differences between normals 
and neurotics may in fact exist, and
that the way to test this hypothesis is to 
develop a reasonably objective and com- 
municable method of scoring that will 
reflect those nuances of drawing which 
contribute to relatively minor modifica- 
tions of the original stimuli. 

A distinction has been made in the 
present investigation between “graphic” 
signs and “methods” signs.* The graphic
signs are defined as those characteristics 
of the reproductions which are scoreable 
solely by inspection or measurement of 
the figures appearing on the test record; 
they do not presuppose observation of 
the subject's test behavior. Examples are 
size, contiguity, substitution, and asym- 
metry. The methods signs are defined as 
those characteristics of the reproductions 
which are scoreable only by direct obser- 
vation of the subject's test behavior. Ex- 
amples are counting, paper rotation, 
direction, and initial part. 

The final list of signs consists of 82 
scoring categories, many of which are 
represented in more than one figure, the 
total number of global and individual 
figure signs being 312. Each scoring
category is numbered, and an individual 
sign is designated by this number and 
by the number of the figure on which 
it appears. “Guides,” for example, is 
scoring category 9; should guide lines 
be used to reproduce figure 5, the desig- 
nation would be 9-5. Similarly, “initial 
part” is category 65; should the diamond 
in figure A be made before the circle is 
drawn, and should the bell of figure 4 
be drawn before the open square, the 
designations would be 65-A and 65-4, 
respectively. Certain categories are repre-
sented in only one figure as, for example,
“pairing” (number 70) which occurs 


* For definitions of the specific “graphic” and 
“methods” signs mentioned in this paragraph, 
see section A below, entitled “Individual Figure 
Signs.” 


solely in figure 1 and is designated as 
70-1. Other categories, such as “rotation: 
total” (number 48) may occur in all 
figures, and the designation is 48-A, 48-1, 
etc. “Global” categories such as sequence, 
cohesion, and colliding refer to the place- 
ment of all the figures on the page and 
hence have no specific figure designation. 
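
The designation scheme just described can be expressed compactly. The sketch below is illustrative only: the function name `sign_designation` is hypothetical, the figure labels A and 1 through 8 follow the nine test designs, and returning the bare category number for a global sign is an assumption, since the text says only that global categories carry no figure designation.

```python
VALID_FIGURES = ("A", "1", "2", "3", "4", "5", "6", "7", "8")


def sign_designation(category, figure=None):
    """Build a sign designation such as '9-5' or '65-A'.

    category: scoring category number (e.g., 9 for "Guides").
    figure:   figure label (A or 1-8); None for a global category.
    """
    if figure is None:
        # Assumed convention: a global sign is written as its number alone.
        return str(category)
    label = str(figure).upper()
    if label not in VALID_FIGURES:
        raise ValueError(f"unknown figure: {figure!r}")
    return f"{category}-{label}"
```

Thus guide lines used on figure 5 would be written `sign_designation(9, 5)`, and the diamond of figure A drawn before the circle would be written `sign_designation(65, "A")`.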

The scoring categories are defined be- 
low. It will be noted that certain numbers 
are omitted in designating categories 
(e.g., there are no categories numbered 
3, 4, or 13). Originally, the individual 
figure categories were numbered from 1 
to 86, but certain categories proved to 
be impracticable or too susceptible to 
subjective judgment in an exploratory 
trial run completed prior to the under- 
taking of the present study. Rather than 
renumber the retained categories, it was 
decided to adhere to the earlier numeri- 
cal designations. 


A. INDIVIDUAL FIGURE SIGNS 


1. Length (all figures). Distance (in millimeters) 
of the horizontal extent of the reproduction. 
Measure from the extreme left to the extreme 
right of the reproduction, using a transparent 
millimeter ruler. 

2. Height (all figures). Distance (in millimeters) 
of the vertical extent of the reproduction. Meas- 
ure from the extreme upper to the extreme 
lower part of the reproduction, using a trans- 
parent millimeter ruler. 

5. Tremulous line (figures A, 4, 6, 7, 8). A shaky,
quivering, wavering line. 

6. Erasure (any figure). All or part of a repro- 
duction exhibits evidence of erasure. 

7. Multiple attempt (any figure). Two or more 
reproductions of a given figure appear on the 
test record. The initial reproduction may involve 
the entire figure or just part of it. The initial 
reproduction may be crossed out, partially erased, 
or left intact. Not scored if the initial reproduc-
tion is erased completely and the second attempt
is superimposed upon it. 

8. Figure A placement (figure A). One or more
reproductions appear above figure A on the test 
record, or figure A is three inches or more from 
the top of the page. 

9. Guides (any figure). Use of extraneous lines
or points as an aid to reproducing the figure. 
10. Overshooting lines (figures A, 4, 7, 8). Lines
overshoot one another at points of juncture. 

11. Retracing (figures A, 4, 6, 7, 8). A part of 
a line or one or more lines are made with repeti-
tive strokes which are more or less superimposed
on one another. The line is belabored rather 
than sketched. 

12. Projection retracing (figures 7, 8). One or 
more of the pointed projections of figures 7 or 8 
are retraced as defined in category 11. Scored also 
for Retracing (category 11). Not scored when 
entire reproduction is retraced. Not scored if
only the diamond in figure 8 is retraced. 

14. Substitution: dot-dash (figures 1, 3, 5). Dots 
are reproduced as horizontal or vertical dashes at 
least jg inch in length. At least four dots should 
be so reproduced to be scored. 

15. Substitution: dot-circle (figures 1, 3, 5). Dots 
are reproduced as unfilled circles or loops. At 
least four dots should be so reproduced to be 
scored.

16. Substitution: dot-ball (figures 1, 3, 5). Dots 
are reproduced as filled-in circles or loops. At least 
four dots should be so reproduced to be scored. 
Slightly enlarged dots are not scored. 

17. Substitution: dot-scribble (figures 1, 3, 5).
Dots are reproduced as “wiggly” lines, “butter- 
fly” lines, or scribbles. At least four dots should be 
so reproduced to be scored. 

18. Substitution: circle-dot (figure 2). The cir-
cles in figure 2 are reproduced as dots or filled-in
loops. At least four circles should be so repro- 
duced to be scored. 

20. Numeration (figures 1, 2, 3, 5, 6). The num-
ber of dots, circles, columns, or waves in the re-
production differs from the actual number in
the stimulus figure. The correct number of ele-
ments is as follows for the figures concerned:


figure 1    12 dots
figure 2    11 columns or 33 circles
figure 3    1, 3, 5, and 7 dots
figure 5    19 dots in semicircle, 7 dots in tangential line
figure 6    4 wave crests in horizontal and 4 in vertical

21. Wave accentuation (figure 6). Marked in- 
crease in amplitude of horizontal or vertical 
wavy lines. At least two of the waves should be 
so reproduced to be scored. 

22. Wave flattening (figure 6). Marked decrease 
in amplitude of horizontal or vertical wavy lines. 
At least two of the waves should be so reproduced 
to be scored. 

23. Wave irregularity (figure 6). Marked varia- 
tion in wave length or amplitude of horizontal 
or vertical wavy lines. At least one of the waves 
should differ markedly from the others to be 
scored. 

25. Curvature contraction (figures 4, 5). Marked 
constriction of bell in figure 4 and/or semicircle 
in figure 5. 
26. Curvature flattening (figures 4, 5). Marked
flattening of bell in figure 4 and/or semicircle in 
figure 5. 

27. Asymmetry (figures A, 3, 4, 5, 7, 8). A sym-
metrical part of a figure is reproduced asymmetri-
cally. Lines equal in length are made unequal,
or corresponding angles equal in size are made 
unequal, or regular contours are made irregular. 

28. Disproportion (figures A, 4, 5, 6, 7, 8). Dis- 
tortion in the relative size of the parts of a figure. 
Parts of equal length, height, or area are made 
unequal. 

29. Displacement (figures A, 3, 4, 5, 6, 7, 8).
Shifting of the point of juncture of one part of
a figure with the other, either to the left or right 
or above or below. In the case of figure 5, junction 
of the tangential line at the center of the semi- 
circle is scored, as well as junction at the extreme 
right. Scoring for figure 6 is strict; the vertical 
line must cross the horizontal at the middle of 
the third horizontal wave; displacement is also 
scored if the horizontal line crosses the vertical 
in the trough at either end of the second (from 
the top) vertical wave. In the case of figure 7, 
displacement is scored when one hexagon is 
drawn well below the other or when the lower 
point of the left hexagon is not below the lower 
point of the right hexagon. 

30. Contiguity (figures A, 4, 5, 7, 8). Contiguous
parts of a figure overlap or are only partially 
contiguous, or overlapping parts are made con- 
tiguous. 

31. Angulation (figures 2, 3, 4, 5, 6, 7). Distor- 
tion of the angular direction of a figure or of 
the angle formed by one part of a figure with the 
other. 

32. Concentric arc (figure 3). Two or more of
the arrowheads in figure 3 are reproduced as 
arcs. 

33. Parallel lines (figures 4, 7, 8). Parallel lines 
are reproduced as converging or diverging. 

34. Parallel columns (figure 2). Slanting col- 
umns of figure 2 are reproduced as converging or 
diverging. Also scored for angulation (31-2). 

35. Parallel rows (figure 2). The parallel rows 
of figure 2 are reproduced as converging or di- 
verging. 

36. Horizontal irregularity (figures 1, 2). The 
horizontal progression of dots or columns is ir- 
regular and uneven rather than correctly aligned 
or consistent in direction. 

37. Irregular spacing (figures 1, 2, 3). The dis- 
tances between dots, circles, rows, or columns are 
markedly unequal. 

38. Perseveration (figures 1, 2, 3). Tendency to 
extend a figure indefinitely. There must be at 
least four additional dots in figure 1, four addi- 
tional columns in figure 2, and one additional 
column in figure 3. Also scored for numeration
(category 20). 

39. Serial incompleteness (figures 1, 2, 3). Tend-
ency to reduce the number of elements in figures
1, 2, and 3. Scored when there are nine or less 
dots in figure 1, eight or less columns in figure 2, 
three or less columns in figure 3. Also scored for 
numeration (category 20). 

40. Closed-open (figures A, 4, 7, 8). Enclosed
figures or parts of figures are not completely 
closed in the reproduction because of the pres- 
ence of gaps. One instance is sufficient to score. 

41. Open-closed (figures 4, 5). Open figures or 
parts of figures are closed or nearly closed in the 
reproduction. 

44. Splitting (figures A, 4, 7). The parts of a 
figure, joined in the stimulus figure, are repro- 
duced so that they are separated by a narrow 
space (less than 4 inch). 

45. Dissociation (figures A, 4, 5, 7). The parts 
of a figure, joined in the stimulus figure, are re- 
produced so that they are separated by a space 
of 14 inch or more. 

46. Simplification (any figure). The reproduc- 
tion is made considerably less complex than the 
original, usually by substitution of solid for 
dotted lines, straight lines for angles, loops or 
curved lines for angular figures, or straight lines 
for wavy lines. Also scored for distortion 
(category 47). 

47. Distortion (any figure). Marked deviation 
of the reproduction from the original, including 
simplification, addition or omission of angles, 
massing of dots, marked line irregularity, closure 
of open figures, etc. 

48. Total rotation (any figure). The entire re- 
production is rotated 45 degrees or more from the 
orientation of the stimulus figure. Turning of the 
stimulus card or of the paper with corresponding 
orientation of the reproduction is not scored. 

49. Part rotation (figures A, 4, 5, 7, 8). Rotation 
of 45 degrees or more of one of the parts of a 
figure. An exception is figure 5, which is scored 
for a rotation of the semicircle of 30 degrees or 
more. 

50. Reversal (figures A, 4, 7). The spatial posi- 
tion of the two parts of a figure are reversed so 
that the left part is reproduced on the right and 
the right part on the left. 

51. Paper rotation (any figure). The paper is 
turned by the subject to a horizontal position 
while one or more figures are reproduced. A 
reproduction so made is not scored for total ro- 
tation (category 48) if it is correctly oriented 
with respect to the horizontal orientation of the 
paper. 

52. Card rotation (any figure). The stimulus 
card is turned by the subject at least 90 degrees 
in any direction. A reproduction made under 
these conditions is not scored for total rotation 
(category 48) if it is correctly oriented with re- 
spect to the orientation of the card. 

53. Fragmentation (any figure). Only an isolated 
part, or fragment, of the total figure is reproduced. 
In the cases of figures 1 and 2, four 
dots or less and three or less columns are so 
scored; in these two instances, serial incomplete- 
ness (category 39) would also be scored. 

54. Omission (figures A, 2, 3, 4, 5, 6, 7, 8). 
Omission of a major part of a figure in the re- 
production. More of the figure is present, how- 
ever, than would be true if fragmentation were 
scored. 

55. Upward slope (figures A, 1, 2, 3, 6, 8). The 
reproduction inclines upward. Scoring is strict; 
any perceptible upward slant is scored, even if 
slight. The horizontal line of figure 6 is normally 
slanted in the stimulus card but is scored if also 
slanted in the reproduction, since most subjects 
fail to observe slant in that figure. Scored also for 
counterclockwise rotations up to 45 degrees. 

57. Direction: sinistrad (figures 1, 2, 3, 4, 5, 6). 
Horizontally oriented stimulus figures are repro- 
duced from right to left. In figure 4, either the 
open square or the bell, or both, may be made 
from right to left. In figure 5 only the semicircle, 
and in figure 6, only the horizontal line are scored. 

58. Direction: upward (figures 2, 3, 6). A figure, 
part of a figure, or a line is drawn in the upward 
direction, i.e., from below to above. In figures 2 
and 3, any one column so drawn is sufficient for 
scoring. 

59. Direction: centrifugal (figures 4, 5). The 
bell of figure 4 or the semicircle of figure 5 is 
made by starting at the crest of the curve and 
drawing lines from the center of the curve to each 
end. 

60. Counting: test figure (figures 1, 2, 3, 5, 6). 
Subject overtly counts the number of dots, circles, 
columns, rows, or waves in the stimulus figure, as 
indicated by moving pencil or finger over each 
element of the figure, counting aloud, moving 
lips while counting subvocally, etc. 

61. Counting: reproduction (figures 1, 2, 3, 
5, 6). The same as category 60, except that the 
elements of the reproduction, rather than the 
stimulus figure, are counted. 

62. Recounting: test figure (figures 1, 2, 3, 5, 6). 
Subject overtly counts the elements of the stimu- 
lus figure, as defined in category 60, two or more 
times. 

63. Recounting: reproduction (figures 1, 2, 3, 5, 
6). The same as category 62, except that the ele- 
ments of the reproduction, rather than the 
stimulus figure, are overtly counted two or more 
times. 

64. Rows (figure 2). Figure 2 is reproduced row 
by row rather than column by column. Making 
the top row only and then completing the 
columns is not scored. 

65. Initial part (figures A, 4, 5, 6, 7, 8). This 
refers to the part of the figure which is repro- 
duced before the other part is made. Scored only 
if the part indicated below is made first: 

figure A—diamond 
figure 4—bell 
figure 5—handle 
figure 6—vertical line 
figure 7—left hexagon 
figure 8—enclosed diamond 

66. Angle differentiation (figure 7). The two 
angles at the extremes of either hexagon are 
made essentially alike rather than as acute and 
obtuse. 

67. Leg shortening (figure 5). One “leg” of the 
semicircle is shorter than the other. The angle 
formed between the horizontal and an imaginary 
line connecting the two ends of the legs should 
be at least 10 degrees. An angle of 45 degrees or 
more is scored for both leg shortening and part 
rotation (category 49). 
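
The angular rule in category 67 can be illustrated with a short computation. The following is a minimal sketch, not part of the original scoring manual: the function names and the coordinate convention (x to the right, y upward, units arbitrary) are assumptions introduced only for illustration.

```python
import math


def leg_end_angle(left_end, right_end):
    """Angle in degrees (0 to 90) between the horizontal and the
    imaginary line connecting the two leg ends of the semicircle."""
    dx = right_end[0] - left_end[0]
    dy = right_end[1] - left_end[1]
    angle = abs(math.degrees(math.atan2(dy, dx)))
    # Fold the result so that direction of travel along the line is ignored.
    return min(angle, 180.0 - angle)


def score_figure_5_legs(left_end, right_end):
    """Return the categories scored under the rule stated in category 67."""
    angle = leg_end_angle(left_end, right_end)
    scored = []
    if angle >= 10.0:
        scored.append("67 leg shortening")
    if angle >= 45.0:
        # The text scores such a record for part rotation as well.
        scored.append("49 part rotation")
    return scored
```

For leg ends at (0, 0) and (10, 2) the connecting line lies about 11 degrees from the horizontal, so only leg shortening would be scored under this sketch.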

68. Side reduction (figures 7, 8). A six-sided 
figure is made with five or less sides. Scored also 
for distortion (category 47). 

72. Correction (all figures). An inaccuracy in 
the reproduction is rectified by drawing another 
line without erasing the incorrect one, by cross- 
ing out a part of the figure, or by partially re- 
tracing the incorrect part. 

73. Multiple stroking (figures A, 4, 6, 7, 8). 
The subject takes two or more strokes to make 
a line which is usually made by one continuous 
stroke. Sketching is excluded from this definition, 
however. 

74. Projection angulation (figures 7, 8). The 
sides of the angles forming the pointed projec- 
tions in figures 7 or 8 are made unequal, lop- 
sided, or asymmetrical; the points of the angles 
are blunted or otherwise distorted; or, in the 
case of figure 8 only, the corresponding angles at 
the extremes of the hexagon are unequal. Scor- 
ing is strict; slight deviations are scored. 

75. Line sag (figures A, 4, 7, 8). Straight lines 
are reproduced as curved, sagging, convex, or 
concave. 

77. Curvature: horseshoe (figure 5). The legs 
of the semicircle in figure 5 curve in toward each 
other, resembling a horseshoe. 

78. Continuation (figures 1, 2). Figure 1 or 2 
is made on two lines, usually when the repro- 
duction is incomplete at the right-hand margin. 

80. Curve squaring (figures 4, 5). The bell of 
figure 4 or the semicircle of figure 5 resembles 
an open square. 


B. GLOBAL SIGNS 


Sequence. Order in which the reproductions 
are placed on the page. Scored in terms of the 
number of sequential connections. A sequential 
connection refers to any two reproductions which 
follow one another in the same order as the 
presentation of the stimulus cards. A reproduc- 
tion may be directly above, below, to the right, 
or to the left of the one which it follows. A 
connection exists only when no other figure intervenes 
between the two successive reproductions. 
If a figure is near the right-hand margin 
and is followed by its successor near the left- 
hand margin below, a connection is scored. If 
a figure is at the bottom of the left half of the 
page and is followed by its successor at the top 
of the right half of the page, or vice versa, a 
connection is scored. If a figure is at the bottom 
of a page and is followed by its successor on the 
obverse side of the paper or on a new sheet, a 
connection is scored. 

Rigid sequence. Each reproduction is placed 
directly below the preceding one. The repro- 
ductions may all be placed on one side of the 
page or continued from the front side to the 
obverse side or to a new sheet of paper. Not 
scored, however, if the reproductions are placed 
on one side of the page in two columns. Also 
scored for 8 sequential connections. 

Cohesion. The degree of compression or expansion 
of all the reproductions, expressed in 
terms of the amount of space filled, as follows: 

Cohesion: 1/3. All of the figures are reproduced 
in the upper, middle, or lower third of one 
page. 

Cohesion: 1/2. All of the figures are reproduced 
in the upper or lower half of one page. 

Cohesion: 2/3. All of the figures are reproduced 
in the upper or lower 2/3 of the page. 

Cohesion: 1. All of the reproductions occupy 
a full page. 

Cohesion: 1 1/3. Reproductions occupy all or 
part of one page plus 1/3 of the back of the 
page or 1/3 of a new sheet. 

Cohesion: 1 1/2. Reproductions occupy all or 
part of one page plus 1/2 of the back of the 
page or 1/2 of a new sheet. 

Cohesion: 1 2/3. Reproductions occupy all or 
part of one page plus 2/3 of the back of the 
page or 2/3 of a new sheet. 

Cohesion: 2. Reproductions fully occupy the 
front and back of a page or one side of one 
sheet and one side of a second sheet. 

Cohesion: 2+. More than 2 pages (front and 
back of one sheet or one side of each of 
two sheets) are used to reproduce the figures. 

Second sheet. Use of more than one sheet of 
paper on which to place the reproductions. Re- 
fers to the use of two or more separate sheets, 
not to the front and back of a single page. 

Collision. Two or more reproductions overlap 
or run into one another. 

Numbering. One or more of the reproductions 
are numbered by the subject. 

Compartments. One or more lines are drawn 
to separate the reproductions. 

Total time. Sum (in seconds) of the times re- 
quired to complete each of the reproductions. 

Time: 139. Scored when the total time is less 
than 140 seconds. 

Short figures. Presence of one or more reproductions 
equaling or falling below the 10th 

TABLE 1 
CRITICAL LENGTHS AND HEIGHTS FOR SHORT, LONG, FLAT, AND TALL FIGURES 


Flat Figures Tall Figures 


Short Figures Long Figures 


Initial Retest Initial Retest Initial Retest Initial Retest 
(mm.) (mm.) (mm.) (mm.) (mm.) (mm.) (mm.) (mm.) 


15 32 
13 q 26 
22 40 
35 I 57 
32 60 
55 96 
39 68 
13 24 


TABLE 2 


CRITICAL RATIOS OF SIGNS SIGNIFICANT AT THE 5 PER CENT LEVEL ON EITHER 
THE INITIAL TEST OR THE RETEST 


Critical Ratio Critical Ratio 
Sign - Sign — —— 
Initial i Retest Initial 


Retest 


Global Individual 
Second sheet . Insig. 80-4 
Short figures : ; 14-5 
Cohesion: 4 : 17-5 
Time: 139 ; 27-5 


Insig. +2.07 
+2.30 Insig. 
Insig. —2.42 
— 2.96 —2.42 
29-5 70 —4.62 
Individual 31-5 Insig. —2.65 
40-5 —2.39 Insig. 
57-5 —2.11 —2. 
59-5 Insig. —2. 
60-5 —3. 
20-6 
22-6 
23-6 Insig. 
29-6 —4.99 
Insig. | 31-6 +-2.67 
— 2.05 47-6 Insig. 
—2.39 58-6 | +2.57 
+2.63 | 60-6 —3.30 
Insig. \| 61-6 —2.26 
— 3.76 5-7 Insig. 
Insig. 10-7 —2.31 
Insig. 27-7 — 2.30 
Insig. 30-7 Insig. 
Insig. —2.14 65-7 | +2.52 
+3.11 Insig. 66-7 | —3.49 
— 3.13 68-7 Insig. 
—2.39 | 72-7 Insig. 
Insig. 27-8 | —3.99 
| 


7 
73 
73 


Insig. 30-8 Insig. 
—2.18 33-8 Insig. 
+7.74 40-8 Insig. 
+2.99 72-8 — 4.05 
— 3.28 74-8 —2.76 Insig. 
— 3.36 75-8 Insig. —2.49 


Note: A minus sign before the critical ratio indicates that the scoring sign occurs more frequently in 
the total neurotic group (N = 108); a plus sign indicates a greater incidence in the total control group 
(N = 285). 


11 
Figure 
27 29 58 55 3t 
69 7° 150 146 —— 
86 82 17! 168 25 
25 25 53 54 47 
33 3! 55 §2 54 
30 31 57 56 58 
86 82 156 147 92 
3° 3° 55 53 63 
63 58 108 104 21 
| 

) 

72-4 Insig. 

75-4 | 


percentile of the total control group in length. 
For the initial test and the retest, the critical 
lengths are given in Table 1. 

Long figures. Presence of one or more reproductions 
equaling or exceeding the 90th percentile 
of the total control group in length. For 
the initial test and the retest, the critical lengths 
are given in Table 1. 
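
The percentile rule for short and long figures can be sketched as follows. This is an illustrative sketch only: the source does not state how the percentiles were interpolated, so the default method of Python's `statistics.quantiles` is an assumption, and the function names and any lengths passed to them are hypothetical rather than taken from Table 1.

```python
from statistics import quantiles


def critical_lengths(control_lengths_mm):
    """Return (10th percentile, 90th percentile) of the control lengths
    for one figure. The interpolation method is an assumption."""
    deciles = quantiles(control_lengths_mm, n=10)  # nine cut points
    return deciles[0], deciles[-1]


def length_signs(lengths_by_figure, cutoffs_by_figure):
    """Score 'short figures' or 'long figures' when at least one
    reproduction reaches the corresponding critical length."""
    signs = set()
    for figure, length in lengths_by_figure.items():
        short_cut, long_cut = cutoffs_by_figure[figure]
        if length <= short_cut:   # equaling or falling below the 10th percentile
            signs.add("short figures")
        if length >= long_cut:    # equaling or exceeding the 90th percentile
            signs.add("long figures")
    return signs
```

The same two functions would serve for flat and tall figures if heights were supplied in place of lengths.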

Flat figures. Presence of one or more reproduc- 


A. GLOBAL AND INDIVIDUAL FIGURE SIGNS 
ON THE INITIAL TEST AND THE RETEST 

As a first step in the analysis of the 
Bender-Gestalt Test records, the per- 
centage incidence of each scoring sign 
was found separately for the total con- 
trol and total neurotic groups. Critical 
ratios of the differences between each pair 
of percentages were then computed. 
Table 2 presents those global and indi- 
vidual figure signs yielding critical ratios 
significant at the 5 per cent level or better 
on either the initial test or the retest. 
Only four of the twelve global signs 
(second sheet, short figures, cohesion: 1/4, 
and time: 139) are significant at the 5 
per cent level on the initial test, and this 
number is reduced to two (short figures, 
cohesion: 14) on the retest. The order 
of placement of the reproductions on the 
page (connections, collision, rigid sequence), 
numbering the reproductions, placing the 
reproductions in compartments, variations 
in height, and excessive length are 
nondiscriminating variables. Most consistently 
significant are spatial compression and 
shortness of length of the reproductions. Of the 
individual figure signs, 40 are significant on 
the initial test and 41 on the retest. 
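
The critical ratio of a difference between two percentages can be sketched as below. The source does not print its formula; the unpooled standard error used here is an assumption, adopted because it reproduces a value of about 3.16 for the recall percentages reported in section B (12.0 per cent of 108 neurotics against 1.8 per cent of 285 controls). The function name and argument order are hypothetical.

```python
import math


def critical_ratio(pct_control, n_control, pct_neurotic, n_neurotic):
    """Critical ratio of the difference between two percentages.

    Positive values mean the sign is more frequent among controls,
    negative values that it is more frequent among neurotics, which
    follows the convention of the table notes.
    """
    p_c = pct_control / 100.0
    p_n = pct_neurotic / 100.0
    # Assumed form: unpooled standard error of a difference of proportions.
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_n * (1 - p_n) / n_neurotic)
    return (p_c - p_n) / se


# Recall percentages quoted in section B.
cr = critical_ratio(1.8, 285, 12.0, 108)
print(round(cr, 2))
```

A ratio of 1.96 or more in absolute value corresponds to the 5 per cent level used throughout this section.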

In order to develop a scoring system 
which would be intrinsically stable and 
reliable rather than dependent upon the 
summation of chance differences, a sign 
was required to satisfy either of the fol- 
lowing criteria before being selected for 
inclusion in the final scoring scheme: 
(a) discrimination between the total con- 

tions equaling or falling below the 10th percentile 
of the total control group in height. For 
the initial test and the retest, the critical heights 
are given in Table 1. 


Tall figures. Presence of one or more reproductions 
equaling or exceeding the 90th percentile 
of the total control group in height. The critical 
heights for the initial test and the retest are 
given in Table 1. 


trol and total neurotic groups at the 5 
per cent level (critical ratio of 1.96) on 
either the initial test or the retest, and 
at the 10 per cent level (critical ratio of 
1.65) on the other, or (b) discrimination 
at the 1 per cent level (critical ratio of 
2.58) on either test and consistency in the 
direction of the difference on the other. 
Thirty-one individual scoring signs and 
three global signs satisfy these criteria. 
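
The two selection criteria just stated can be expressed as a small decision rule. This is a minimal sketch under stated assumptions: the function name is hypothetical, both critical ratios are assumed to be available as signed numbers, and criterion (a) is applied to absolute values exactly as worded, with direction checked only under criterion (b).

```python
def passes_first_screen(cr_initial, cr_retest):
    """Apply selection criteria (a) and (b) to one scoring sign."""
    larger = max(abs(cr_initial), abs(cr_retest))
    smaller = min(abs(cr_initial), abs(cr_retest))

    # (a) 5 per cent level (1.96) on either test and
    #     10 per cent level (1.65) on the other.
    criterion_a = larger >= 1.96 and smaller >= 1.65

    # (b) 1 per cent level (2.58) on either test and the same
    #     direction of difference on the other.
    same_direction = cr_initial * cr_retest > 0
    criterion_b = larger >= 2.58 and same_direction

    return criterion_a or criterion_b
```

A sign with ratios of -2.79 and -0.65, for example, fails (a) but passes (b), since the larger ratio exceeds 2.58 and both differences run in the same direction.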
Since, however, the two major cri- 
terion groups differ somewhat with re- 
spect to age composition, educational 
background, and marital status, even 
though they are essentially alike with 
regard to nationality, race, sex, and 
military service, the possibility cannot 
be ignored that differences in these three 
variables may conceivably account for 
some of the obtained differences. In 
order to give due consideration to this 
possibility, an attempt was made to match 
as many of the neurotic subjects as pos- 
sible with an equal number of subjects 
from the below-70 group with respect 
to the three variables under discussion. 
The below-70 group was selected as the 
contrasting group because it satisfied 
both the clinical and psychometric criteria 
of normality adopted for this investigation. 
A total of 138 subjects was 
so matched (69 neurotics and 69 controls) 
and will be referred to hereafter as the 
neurotic matched criterion group and 
the below-70 matched criterion group. 
While perfect matching could not be 
achieved because of the requirement that 
each case be matched with respect to 

TABLE 3 


AGE, GRADE, AND MARITAL STATUS COMPOSITION 
OF THE MATCHED CRITERION GROUPS 


Neurotic 


(N=69) 


Below 70 
(N =69) 


Age (years) 
Mean 
SD 
Education (grades) 
Mean 
SD 


27.26 
4.70 


27.84 
4.34 


12.19 


Marital Status 
Single 
Married 
Divorced 
Separated 


three variables simultaneously, the suc- 
cess of the matching technique is evident 
from Table 3. 

The percentage incidence of the 34 
tentatively selected scoring signs was 
found separately for the normal and 
neurotic matched groups. Critical ratios 
of the differences between each pair of 
percentages were then computed. Since 
the number of cases is smaller in the 
matched groups than in the original 
groups, the critical ratios tend to be 
smaller even when the absolute differences 
are identical. As a basis for inclusion 
of the signs in the final scoring 
system, therefore, the criterion set for 
this second weeding-out process was 
somewhat more lenient, namely, discrim- 
ination at the 5 per cent level on either 
the initial test or the retest and con- 
sistency in the direction of the difference 
on the other. Table 4 presents the initial 
and retest critical ratios of the 27 indi- 
vidual figure signs and the three global 
signs which survived both screening pro- 
cedures. 

The 30 signs which constitute the final 
scoring system may be grouped as 
follows: 


Description 
Numeration (number of wave crests 
in figure 6) 
Wave flattening in figure 6 
Asymmetry in figure 5 
Displacement in figures 5 and 6 


Contiguity in figure A 
Parallel lines in figure 8 


*Signs followed by a plus occur more frequently 
among normals than among neurotics; 
all other signs discriminate in the opposite direction, 
i.e., are more characteristic of neurotics. 


TABLE 4 


Critical Ratio 


Sign 


Initial Retest 


Critical Ratio 


Initial 


.81 
.68 
34 
-35 
.00 


Short figures 
Cohesion: 4 
Time: 139 
20-6 


22-6 
27-5 
20-5 
29-6 
30-A 
33-8 
55-1 
55-2 
57-4 
70-1 
72-4 


+77 


Note: A minus sign before the critical ratio indicates that the scoring sign occurs more frequently 
in the neurotic group (N = 69); a plus sign indicates a greater incidence in the below-70 group (N = 69). 


12.19 
2.05 
37 32 
32 33 
° I 
° 3 
Sign* 
20-6 
22-6 
27-5 
29-5 
29-6 
33-8 
CRITICAL RATIOS OF THE SELECTED SIGNS FOR THE MATCHED CRITERION GROUPS 
| Retest 
—2.46 —2 57-5 —2 —2.06 
—2.58 —2 58-6 | +1.31 
+1.96 +o 60-1 —2.44 —2.67 
—2.79 60-2 —3.20 —1.g0 
—0.65 -3 60-3 —4.76 —2.30 
—2.56 —-o 60-5 —4.73 —3.62 
—$.08 —1.19 60-6 —2.92 —3.22 
—0.87 —1.98 61-1 —2.25 
—0.40 —3.590 61-2 —2.63 —1.93 
—1.96 —1.88 61-3 —2.97 —1.83 
—2.05 0.00 61-6 —3.01 
—1.66 —2.77 65-7 +3.50 +2.45 
+4.02 +3.59 66-7 —3.20 —4.33 
+0.75 +2.28 75-A —0.68 —2.51 
—0.85 —3.24 | 75-4 — 3.0% —2.50 
Description 
Upward slope in figures 1 and 2 


Sinistrad direction in figures 4 and 5 


Counting of test figure in figures 
1, 2, 3, 5, and 6 


Upward direction in figure 6 
Counting of reproduction in figures 
1, 2, 3, and 6 


Initial part (left hexagon made first) 
in figure 7 

Angle differentiation in figure 7 

Pairing of dots in figure 1 

Correction in figure 4 

Sagging line in figures A and 4 


Spatial compression 

One or more reproductions of ab- 
breviated length 

Total time less than 140 seconds 


Short figures 
Time: 139+ 


B. THE RECALL TEST 


The mean number of figures recalled 
by the total control group is 6.28, compared 
with a mean of 5.93 for the total 
neurotic group. The standard deviations 
are 1.15 and 1.60, respectively, and the 
critical ratio of the difference between 
means is only 1.79. There is, however, a 
tendency for the neurotic group to contain 
more instances of extreme paucity 
of recall. Thus 12.0 per cent of the neurotics 
recall 3½ figures or less compared 
with 1.8 per cent of the controls. If one 
focuses on the frequency of recall of each 
of the figures rather than on the total 
number of figures recalled, only figures 
A and 6 show differences in percentage 
incidence significant at the 5 per cent 
level (critical ratios of 2.29 and 2.24, 
respectively). 

Inspection of a number of recall test 
records suggested that when the figures 
were reproduced from memory, certain 
modifications were present which either 
infrequently or never occurred during 
the initial and final tests (when the reproductions 
were copied from models). 
These changes were classified as distortions, 
confusions, half figures, closures, 
and rotations. Distortions occur very 
frequently regardless of the group to 
which the subject belongs, 57 per cent 
of the below-70 matched criterion group 
and 64 per cent of the neurotic matched 
criterion group having one or more distorted 
reproductions in their recall records. 
The differential incidence of all 
special modifications is, however, statistically 
insignificant, except for closure 
on figure 4. 

Only six signs meet the criterion set 
for the recall test, namely, discrimination 
between the total control and total neurotic 
groups at the 5 per cent level. These 
signs and their critical ratios (of the 
differences between percentages) are 
shown in Table 5. Closure-4 refers to 
reproducing the open square in figure 4 
as a closed square. A (5-9) refers to recall 
of figure A as the fifth, sixth, seventh, 
eighth, or ninth figure recalled by the 
subject, while 6 (1-4) refers to recall of 
figure 6 among the first four figures recalled. 
Short figures-4 refers to the presence 
of four or more reproductions at or 
below the 10th percentile of the initial 


TABLE 5 


CRITICAL RATIOS OF SIGNS INCLUDED IN THE 
FINAL SCORING SYSTEM FOR THE RECALL TEST 

Sign                 Critical Ratio 

Cohesion:                −2.13 
Closure-4                +2.21 
Short figures-4          −2.69 
A (5-9)                  +3.58 
6 (1-4)                  +4.24 
3½ or less               −3.16 


Note: A minus sign before the critical ratio indicates 
that the scoring sign occurs more frequently 
in the total neurotic group (N = 108); a 
plus sign indicates a greater incidence in the total 
control group (N = 285). 



Sign 

55-2 

57-5 

60-2 

60-3 

60-5 

60-6 

61-1 

61-2 

61-3 

61-6 

65-7+ 

66-7 

70-1+ 

72-4 

75-4 

TABLE 6 
MEANS AND CRITICAL RATIOS OF THE DIFFERENCES BETWEEN MEANS 


OF THE INITIAL, RETEST, AND COMBINED SCORES 


Initial Score 


Retest Score Combined Score 


Group 
Mean 


Below 70 (matched) 
Neurotic (matched) 


Below 70 (unmatched) 
Neurotic (unmatched) 
Below 70 

1 above 70 

Below 70 

2 above 70 


1 above 70 

2 above 70 
Total control 
Total neurotic 


test reproductions of the total control 
group in length. The recall of less than 
four figures by the subject is designated 
by the sign, 3½ or less. 


C. SCORING OF THE TEST RECORDS OF THE 
CRITERION GROUPS WITH THE 
SELECTED SIGNS 


Following the isolation of those signs 
on the initial test, recall test, and retest 
which individually discriminated between 
the normal and neurotic subjects, 
the next step was to rescore all the records 
for these “selected signs,” add the 
signs algebraically, and tabulate the distributions 
of these summated scores. A 
combined score, defined as the sum of the 
initial and retest scores of each subject, 
was also obtained. The presence of a 
sign which occurs more frequently among 
neurotic subjects is indicated by a minus 
sign, and the presence of a sign occurring 
more often among the controls is designated 
by a plus. The sums of these signs 
appear without sign when the number of 
minus signs outweighs the plus signs, and 
appear with a plus sign when the number 
of plus signs outweighs the negative 
signs. If, for example, a given test record 
has six negative signs and three positive 
signs, the sum is designated as 3; should 
the record contain four positive signs 
and three negative signs, however, the 
sum would be designated as +1. The 
results of this procedure for the matched 
and unmatched criterion groups, the 
three control subgroups, and the total 
control and total neurotic groups are 
presented in Table 6, which gives the 
means and critical ratios of the difference 
between means of the initial, retest, and 
combined scores. 
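
The designation rule for summated scores can be sketched in a few lines. The function names are hypothetical and the sketch covers only the arithmetic described above: a balance is kept internally as minus signs less plus signs, printed without sign when minus signs predominate and with a plus when plus signs predominate.

```python
def net_balance(n_minus, n_plus):
    """Number of minus (neurotic) signs less number of plus (control) signs."""
    return n_minus - n_plus


def designate(balance):
    """Format a balance in the convention of the text: unsigned when minus
    signs outweigh plus signs, prefixed with '+' in the opposite case."""
    if balance < 0:
        return f"+{-balance}"
    return str(balance)


initial = net_balance(n_minus=6, n_plus=3)   # designated "3"
retest = net_balance(n_minus=3, n_plus=4)    # designated "+1"
combined = initial + retest                  # combined score is the sum of the two
print(designate(initial), designate(retest), designate(combined))
```

Keeping the balance as a signed integer until the final formatting step makes the combined score a plain sum of the initial and retest balances.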

All the differences between the control 
and neurotic groups are highly signifi- 
cant, and overlapping of the distribu- 
tions of scores is not great. In the case 
of the below-70 and neurotic matched 
groups, for example, an initial test score 
of 5 screens out 88.4 per cent of the neu- 
rotics at the expense of only 26.1 per cent 
false positives; a retest score of 4 screens 
out 81.2 per cent of the neurotics with 
only 23.2 per cent false positives; and 
a combined score of 8 eliminates 89.0 


| 
| 69 3.07 2.10 S87 
86 3.56 | 2.55 6.12 | Fe 
| 39 | 6.77 7-96 15.36 
| 55 3-34 2.35 5-7° 
| 68 | 3.82 | 6.57 
| 155 et 2.35 | 5.70 
| 62 3.89 1.38 83 7.11 
| | 3-% 3-23 | | 7.11 66 
285 3.58 2.63 6.26 
| 108 | 8.26 ot 6.59 aia 14.85 14.08 
- 


16 WALLACE GOBETZ 


per cent of the neurotics at a cost of 
23.2 per cent of the controls. None of 
the differences between the three con- 
trol subgroups is significant at the 1 per 
cent level, although the trend of the 
means is in accord with expectation. With 
regard to the recall scores, the means of 
the below-70 and neurotic matched 
groups are +.89 and .07, respectively, 
a difference yielding a critical ratio of 
4.82. This difference is much smaller 
than that obtained for the initial, retest, 
and combined scores. Various combinations 
of the recall score with each of 
these scores were tried, but the general 
finding was that the recall score did not 
sufficiently increase the discriminating 
power of the test to warrant its inclusion 
in a more complex composite 
score. 
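
The screening percentages quoted above follow from counting the records at or beyond a cutting score in each group. The sketch below assumes that a cutting score of 5 means an unsigned score of 5 or more, which the text implies but does not state; the function name and the short score lists are hypothetical and are not data from the study.

```python
def screening_rates(neurotic_scores, control_scores, cutoff):
    """Per cent of neurotics screened out and per cent of controls falsely
    identified, for scores on the unsigned (neurotic-direction) scale."""
    hits = sum(score >= cutoff for score in neurotic_scores)
    false_positives = sum(score >= cutoff for score in control_scores)
    return (100 * hits / len(neurotic_scores),
            100 * false_positives / len(control_scores))


# Hypothetical scores for illustration only.
neurotics = [7, 9, 4, 6, 8]
controls = [2, 5, 3, 1, 4]
print(screening_rates(neurotics, controls, cutoff=5))
```

With groups of 69 cases each, as in the matched criterion groups, every case contributes about 1.45 percentage points to either rate.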


D. EFFECT OF EDUCATION, AGE, AND 
INTELLIGENCE UPON TEST SCORES 


The similarity of the results for 
matched and unmatched groups suggests 
that such factors as education and age 
bear little or no relationship to the 
Bender-Gestalt Test scores of adult subjects. 
One would therefore expect that intelligence 
would be uncorrelated with 
Bender-Gestalt Test performance. These 
expectations are confirmed by Table 7, 
which summarizes the correlations of 
these three variables with the initial and 
retest scores of the total control and 
total neurotic groups. 


E. SCORING OF THE INITIAL TEST RECORDS 
OF THE MATCHED CRITERION GROUPS 
WITH PASCAL AND SUTTELL’S SIGNS 


Pascal and Suttell (35) are the only 
investigators other than the present 
writer who have combined Bender-Ges- 
talt signs into a total score which, it is 
claimed, will differentiate normals from 
abnormals with a reasonably high degree 
of accuracy. Pascal and Suttell claim 
that their scores will distinguish normals 
from neurotics, normals from psychotics, 
and neurotics from psychotics, although 
the latter differentiation is admittedly 
more subject to error. From the stand- 
point of the subject of the present paper, 
the main interest centers around the rela- 
tive effectiveness of Pascal and Suttell’s 
signs in distinguishing between the nor- 
mal and neurotic cases employed in the 
present investigation. This question as- 
sumes significance in view of the fact 
that many of the signs used by Pascal 
and his collaborator were identical or 
similar to signs tentatively tried out in 
the present study but rejected for the 
final scoring system because of lack of 
statistical validity. The below-70 and 
neurotic matched criterion groups were 
selected for this check because they are 
sharply contrasting groups and because 
such factors as race, nationality, sex, age, 
education, and marital status were held 
constant, eliminated, or neutralized. 

Accordingly, the present writer learned 
Pascal and Suttell’s scoring scheme and 


TABLE 7 


CORRELATIONS OF GRADE, AGE, AND INTELLIGENCE TEST SCORES 
WITH INITIAL AND RETEST SCORES 


Total Control 


Total Neurotic 


Initial 


Initial Retest 


— .1rt.04 
06 + .04 


—.12+.06 
.20+ .06 


— .06+ .07 
.16+ .06 


— .10+ .07 — .07 


N Retest N 
Grade 282 — .10+ .04 108 
Age 277 .04 107 
Otis 285 —.17+.04 
W-BI 84 

TABLE 8 


RAW SCORE MEANS AND CRITICAL RATIOS OF THE DIFFERENCES BETWEEN MEANS ON THE INITIAL 
TEST FOR THE MATCHED GROUPS, SCORED ACCORDING TO PASCAL AND SUTTELL’S SYSTEM 


Grade 


Critical 
Neurotic Ratio 


(N =69) 


8-11 
12 
College 


28.60 
27.55 
29.4e .62 


Total group 


28.45 1.38 


scored several of the test records which 
appear in their book until his total 
scores closely approximated those given 
in their scoring manual. All 138 cases in 
the matched groups were then scored in 
accord with this method. The raw score 
means and critical ratios of the dif- 
ferences between means are given in 
Table 8. 

Education is unrelated to the size of 
the score of either the controls or the 
neurotics, as is evident from the fact 
that subjects who did not receive high 
school diplomas, high school graduates, 
and college trained subjects obtain al- 
most identical mean scores within the 
criterion group of which they are mem- 
bers. Pascal and Suttell claim that educa- 
tion is a significant factor and present 


separate normative data for college and 
non-college trained individuals. Of much 
greater significance, however, is the fail- 
ure of their scoring scheme to distinguish 
between any of the normal and neurotic 
groupings. 

Pascal and Suttell’s administration of 
the Bender-Gestalt differs from that em- 
ployed in the present study in two re- 
spects as far as the initial test is con- 
cerned: (a) the time required to repro- 
duce the figures is not recorded, and (b) 
sketching is prohibited. However, sketch- 
ing occurs relatively infrequently in the 
matched group cases, and it is difficult 
to see how the single factor of recording 
time could wipe out the presumed dif- 
ferences in test performance which are 
implicit in their method of scoring. 


V. THE CROSS-VALIDATION 
STUDY 


If Pascal and Suttell’s scoring scheme 
holds up so poorly under the strain of 
cross validation, one may well speculate 
as to the fate of the scoring method de- 
veloped in the present investigation 
under similar circumstances. The pro- 
cedure of scoring the test records of the 
very subjects upon whom the item analy- 
sis is based is open to the objection that 
the summation of chance differences may 
result in a spuriously high differential 
that cannot subsequently be duplicated 
when applied to other samples drawn 
from the same populations. 


Ideally, it would have been desirable 
to have obtained samples of normal and 
neurotic veterans drawn from the same 
populations upon which the item analysis 
was performed, but this procedure was 
not feasible because of the virtually in- 
surmountable difficulty of obtaining ad- 
ditional neurotic cases from the Veterans 
Administration mental health clinics in 
New York City during a period of peak 
load. The thought then occurred that the 
cross-validation study might profitably 
be conducted with nonveteran male 
adults differing markedly in personal 


Mean — 
| Below 70 
(N =69) | 
25.15 | 
24.20 
27.45 
25.55 

adjustment, for if the test held up under 
these conditions the conclusions con- 
cerning validity would not have to be 
restricted to the veteran population but 
would have a wider applicability to the 
male population at large. 

It was decided to select the cases from 
the files of the New York University 
Testing and Advisement Center, a testing 
and vocational guidance agency that 
serves the metropolitan area and suburbs. 
‘The vast majority of clients 18 years of 
age and over who avail themselves of 
the Center's services desire educational 
or vocational guidance. However, a cer- 
tain proportion of these individuals are 
subsequently found to have personality 
problems sufficiently severe to warrant 
the recommendation that they undergo 
psychotherapy as an initial step in the 
solution of their vocational problems. 


TABLE 9
COMPOSITION OF THE CROSS-VALIDATION GROUPS

                                     Education            Age            Marital Status
                                   Mean                Mean              %        %       %
Group                         N   (grade)     SD      (years)    SD   Married  Single  Other

TAC Male
  Satisfactory adjustment     29   14.41     1.67      25.56    8.04    31.0    69.0    0.0
  Unsatisfactory adjustment   29   14.41     1.67      25.68    7.60    20.7    79.3    0.0
TAC Total Male
  Satisfactory adjustment     39   14.36     1.48      25.42    7.34    33.3    66.7    0.0
  Unsatisfactory adjustment   43   13.98  [illegible]  27.56    8.42    20.9    74.4    4.7
TAC Total Female
  Satisfactory adjustment     15   13.07     1.34      26.60    8.36    20.0    73.3    6.7
  Unsatisfactory adjustment   21   13.71     1.64      33.10    8.82    33.3    61.9    4.8
TAC Total
  Satisfactory adjustment     54   14.00     3.85      25.74    7.66    29.6    68.5    1.9
  Unsatisfactory adjustment   64   13.89     1.69      29.38    8.94    25.0    70.3    4.7

The testing and counseling procedures
consist of (a) an initial interview, in
which personal history data regarding
education, employment experience,
health, family background, marital ad-
justment, interests, hobbies, special tal-
ents, and personal and social adjustment
are obtained; (b) 10 to 16 hours of group
and individual testing, including the
Wechsler-Bellevue, educational achieve-
ment tests, aptitude tests, interest tests,
personality inventories, and projective
personality tests; (c) one or more counsel-
ing conferences in which the findings
are presented to the client and plans
for the future are mutually worked out;
and (d) a detailed written report which
integrates biographical and test data and
recapitulates the major findings. Since
comparatively few of these clients had
educations below the high school level,
the cross-validation cases were selected
from a file of approximately one thou- 
sand clients ranging in age from 18 to 50 
and in education from four years of high 
school through college. Each of these 
cases was evaluated from the standpoint 
of whether the subject was effecting an 
unusually good or an exceptionally poor 
emotional and social adjustment. Only 
when personal history data, Wechsler- 
Bellevue scatters, and personality tests 
pointed consistently in one direction or 
the other and only when these data were 
consistent with the impressions of the 
psychologist who originally handled the 
case was the subject selected for inclusion 
in the cross-validation groups. The Ben-
der-Gestalt Test records were removed
from the folders prior to this appraisal
so that they would not influence the final
evaluation. The Bender-Gestalt Tests
were in all instances administered with
the same instructions that had previously
been used for the veteran subjects, but
they had not been scored, since no scor-
ing scheme was available at the time.
Most of the poorly adjusted subjects who 
were selected would probably have been 
diagnosed as neurotic by a psychiatrist 
(no cases in which there was any sus- 
picion of psychosis were included), but 
the question of diagnosis was avoided by 
merely designating the cases as satisfac- 
torily or unsatisfactorily adjusted. 

A total of 118 cases which satisfied the 
above criteria was finally selected. Of 
these 54 (39 males and 15 females) were 
designated as the TAC total satisfactory 
adjustment group, and 64 (43 males and 
21 females) as the TAC total unsatisfac-
tory adjustment group. (“TAC” refers
to “Testing and Advisement Center.”)
Originally, it had been intended that 
only males would be included, but a 
smaller number of female records was 
added in order to shed some light on the 
question of whether the method of scor-
ing is valid for both sexes. These groups
were subdivided into several subgroups
which are described in Table 9 with re-
gard to size, education, age, marital
status, and sex. As a first step, the male 
subjects were matched, pair by pair, for 
educational background, age, and mari- 
tal status. Matching was possible for 58 
of the 82 male cases, and the two groups 
so matched were designated as the TAC 
male satisfactory adjustment group (N =
29) and the TAC male unsatisfactory
adjustment group (N = 29). The “TAC
total male satisfactory adjustment” and 
“TAC total male unsatisfactory adjust- 
ment” groups refer to all the males in 
the cross-validation study; the “TAC 
total female satisfactory adjustment” 
and “TAC total female unsatisfactory 
adjustment” groups refer to all the fe- 
males. 
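
The pair-by-pair matching described above can be sketched as a simple greedy procedure. The source does not state the tolerances that were applied, so the limits below (one grade of education, three years of age, identical marital status) are hypothetical, as are the class and function names. The sketch shows only the general idea and is not the procedure actually followed.

```python
from dataclasses import dataclass


@dataclass
class Client:
    ident: str
    education: int  # highest grade completed
    age: int        # years
    marital: str    # "married", "single", or "other"


def match_pairs(satisfactory, unsatisfactory, max_edu_gap=1, max_age_gap=3):
    """Greedily pair each satisfactory case with the closest unused
    unsatisfactory case; cases with no acceptable partner are dropped."""
    pairs = []
    available = list(unsatisfactory)
    for s in satisfactory:
        candidates = [
            u for u in available
            if u.marital == s.marital
            and abs(u.education - s.education) <= max_edu_gap
            and abs(u.age - s.age) <= max_age_gap
        ]
        if not candidates:
            continue  # unmatched case stays only in the "total" group
        best = min(candidates,
                   key=lambda u: (abs(u.education - s.education),
                                  abs(u.age - s.age)))
        available.remove(best)
        pairs.append((s, best))
    return pairs


sat = [Client("s1", 12, 24, "single"), Client("s2", 16, 40, "married")]
unsat = [Client("u1", 12, 25, "single"), Client("u2", 12, 30, "single")]
for a, b in match_pairs(sat, unsat):
    print(a.ident, "matched with", b.ident)
```

Cases left without a partner correspond to the 24 males who appear only in the total male groups.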

Tables 10 and 11 recapitulate the find- 
ings in the cross-validation study with 
respect to the initial, retest, and com-
bined scores. Table 10 may be compared
with Table 6, which presents analogous 
results for the original veteran groups. 
Shrinkage in the size of the difference 
between means (and hence in validity) 
is evident, but discrimination is still rela- 
tively good. For the entire group of 118 
cases, the initial and combined scores 
are roughly equivalent in discriminating 
power, and both are slightly superior to 
the retest. An initial score of 6 or above 
and a combined score of 10 or above each 
screen out 70.3 per cent of the poorly 
adjusted cases at the cost of 40.7 per cent 
false positives; a retest score of 4 or above, 
on the other hand, has a false positive 
rate of 51.9 per cent for a pick-up of 
71.9 per cent. With regard to the females, 
the initial score means of 4.60 and 7.14 
and the combined score means of 6.93
and 11.67 yield statistically significant 
differences, but the retest score means of
3.27 and 4.52 do not, although the dif-
ference is in the expected direction. In
view of the fact that two of the three
kinds of scores shown to be significant
for the males are also significant for the
females, it is probable that the scoring
method developed in this study is appli-
cable to either sex and that the lack of
completely parallel findings is attributa-
ble to the small size of the female sam-
ples. As expected from its comparatively
poor showing with the original veteran
groups, the recall test fails to hold up
under cross validation. The means of the
TAC total satisfactory adjustment and
TAC total unsatisfactory adjustment
groups are + .74 and + .62, respectively,
and the critical ratio of the difference is
only 0.64.


TABLE 10
MEANS AND CRITICAL RATIOS OF THE DIFFERENCES BETWEEN MEANS OF THE INITIAL, RETEST, AND
COMBINED SCORES FOR THE CROSS-VALIDATION GROUPS

                                     Initial Score   Retest Score   Combined Score
Group                           N        Mean            Mean            Mean

TAC Male
  Satisfactory adjustment      29     [illegible]     [illegible]        9.90
  Unsatisfactory adjustment    29     [illegible]     [illegible]     [illegible]
  Critical ratio                         2.96            3.66            3.64
TAC Total Male
  Satisfactory adjustment      39        5.13            3.92            9.05
  Unsatisfactory adjustment    43        7.44            6.74           14.30
  Critical ratio                         3.80            4.15            4.53
TAC Total Female
  Satisfactory adjustment      15        4.60            3.27            6.93
  Unsatisfactory adjustment    21        7.14            4.52           11.67
  Critical ratio                         2.82            1.25            2.56
TAC Total
  Satisfactory adjustment      54        4.98            3.74            8.46
  Unsatisfactory adjustment    64        7.42            6.02           13.44
  Critical ratio                         4.98            4.03            5.08


VI. DISCUSSION


Perhaps the most significant finding
of the present investigation is that nor-
mals and neurotics are much more alike
than different with regard to perceptual-
motor response to simple geometric de-
signs, to the extent that such responses
are measured by the reproduction of the
Bender-Gestalt figures from a model or
from memory. Only about one-sixth of
the deviations for which the reproduc-
tions were scored discriminate between
the normal and neurotic subjects in the
initial validation study, even when the
comparatively lenient criterion of differ-
entiation at the 5 per cent confidence
level is invoked, and the number of dis-
criminating signs shrinks considerably if
a more rigorous test of statistical signifi-
cance is applied. This result is in essential
agreement with that reported by Billings-
lea (17), who found only 12 out of 63
indices (19 per cent) significant at the 5
per cent confidence level. It agrees also
with Bender (12) and Woltmann (51)
who, without reporting quantitative data,
conclude that the Bender-Gestalt is of
limited value in distinguishing normal
from neurotic individuals. It is in marked
disagreement with Hutt (28), who speaks
of a “neurotic syndrome” on the Bender-
Gestalt without, however, offering any
statistical support for his conclusion,
and with Pascal and Suttell (35), who
report that 105 of their 200 signs (52.5
per cent) differentiate between their pa-
tient and nonpatient groups. It is only
fair to point out that in the latter in-
stance a sign is regarded as significant if
it distinguishes normals from a combined
neurotic and psychotic group of patients,
and that their figure would be somewhat
smaller if the psychotics were excluded
from the comparisons.

TABLE 11
CUMULATIVE PERCENTAGE DISTRIBUTIONS OF SCORES FOR
ALL CASES IN THE CROSS-VALIDATION STUDY*

           Initial Test         Retest         Combined Score
Score     Sat.    Unsat.     Sat.    Unsat.        Unsat.
           %        %         %        %             %

 25                                                 100
 24                                                  98
 23                                                  96
 22                                                  96
 21                                                  93
 20                                                  92
 19                                                  84
 18                                                  78
 17                                                  71
 16               100                 100            64
 15                98                  98            59
 14                98                  98            53
 13                98                  98            50
 12                98                  96            42
 11       100      93        100       95            39
 10        96      85         98       89            35
  9        96      81         98       81            29
  8        90      57         98       76            25
  7        85      46         94       60            23
  6        72      37         83       51            15
  5        59      29         79       43             9
  4        48      15         66       37             4
  3        31      10         48       28             1
  2        11       4         33       17             1
  1         7                 14       14
  0         3                  5        7
 +1       [two entries of 3; columns uncertain]
 +2

* TAC total satisfactory adjustment group, N=54; TAC total unsatisfactory adjustment group,
N=64.
[The Combined Score column for the satisfactory adjustment group is illegible.]

Modifications of the stimulus figures
are frequent in both neurotic and normal
subjects, according to the findings of the
present study, but these deviations rarely
approach the point where they can be
described as gross destructions of the
Gestalt, except when the reproductions
are drawn from memory in the absence
of an objective stimulus; even under 
such circumstances the incidence of dis- 
tortions is approximately equal for the 
two contrasting criterion groups. The
writer would agree with Billingslea and 
with Pascal and Suttell that gross devia- 
tions are nondiscriminating and that, if 
any separation is to be effected, more 
subtle modifications must be defined, 
observed, and isolated. The writer would
further agree that a feasible scoring 
scheme should take into account these 
minute deviations, but exception is taken 
with Billingslea’s belief that such a 
method of scoring should be accom- 
plished by equally precise physical meas- 
urements. Billingslea took as long as 
fifteen hours to score a single record— 
a procedure which might be justified 
in the research laboratory but not in the 
psychological clinic. The writer agrees 
with Pascal and Suttell that scoring 
should be accomplished by inspection 
rather than measurement but that the 
scoring criteria should be so carefully 
described that no great loss is experi- 
enced in going from quantitative meas- 
urement to qualitative observation. 
Moreover, this qualitative inspection 
should be supported by quantitative 
normative data rather than by loose gen- 
eralizations based on clinical experience 
with a predominantly abnormal popu- 
lation. 

In the present investigation, none of 
the “abnormal” signs occurs exclusively 
in the abnormal subjects, and none of 
the “normal” signs is found solely in the 
control subjects—a finding similar to 
that reported by Anastasi and Foley (5) 
in their experimental study of drawings 
but differing from the results of Pascal 
and Suttell, who give maximum weight 
to signs having zero frequencies in their 
control group and frequencies of one or 
two in their patient group. It might also
be noted in this connection that one
might have entertained any one of four
hypotheses prior to the present study:
(a) deviations of the reproductions from
the stimulus figures are more frequent
among abnormals than normals, i.e.,
normals tend to conform more strictly to 
the stimuli; (b) deviations are more fre- 
quent among normals, i.e., abnormals 
tend to conform more rigidly to the 
stimuli; (c) deviations are equally char-
acteristic of both normals and abnor-
mals; and (d) certain deviations are more
characteristic of normals and others more 
characteristic of abnormals. Most cli-
nicians implicitly assume hypothesis a,
a corollary of which is that the more 
nearly the reproductions approximate 
the originals, the more likely the exami- 
nee is to be normal; and that the more 
the record deviates from the original the 
greater the severity of the disturbance. 
The hypothesis that deviations are more 
characteristic of abnormals is supported 
by the fact that in the present investiga-
tion 31 of the 40 initial test signs (77.5
per cent) and 35 of the 41 retest signs 
(85.4 per cent) which discriminate at the 
5 per cent level are negative in sign, i.e., 
occur more frequently among neurotics. 
Hypotheses b and c are rejected on this
basis, while hypothesis d is given support. 
Billingslea also found signs which dis- 
criminated in both directions, and such 
signs are implicit in Hutt’s formulations. 
The two relevant hypotheses may there- 
fore be combined and restated as a con- 
clusion, namely, that deviations of the 
reproductions from the originals occur 
more often in the records of maladjusted 
individuals, and that both types of devia- 
tions should be considered in evaluat- 
ing Bender-Gestalt Test performance. 
Whether the degree of psychological dis-
turbance progressively parallels an in-
crease in the excess of “abnormal” over
“normal” signs cannot be ascertained 
from the present study, since the subjects 
were not graded according to severity of 
disturbance and since, unlike Pascal and 
Suttell’s study, psychotic subjects were 
not tested. A high correlation between 
size of score and severity of disturbance 
seems unlikely, however, in view of the 
high degree of overlapping of the control 
and experimental groups, an overlapping 
which is especially pronounced in the 
cross-validation study. Cut-off scores may 
be set which will separate normals and 
abnormals in varying proportions, the 
normals tending to make more favorable 
(lower) scores and the abnormals more 
unfavorable (higher) scores, but as the 
critical score is raised there is a steady 
increase in false negatives. Although it 
is likely that psychotics will tend to make 
higher scores than neurotics and normals, 
the interpretation of increasing severity 
of disturbance with increasing score 
within the neurotic or maladjusted 
groups in the present investigation is 
not warranted by the data. The presence 
of three or four signs which occur very 
rarely in the standardization group (e.g., 
total or part rotation, dissociation, frag- 
mentation, compartments, numbering, 
perseveration, confused sequence) may 
be much more significant than the occur- 
rence of eight or nine deviations that are 
found more often in both groups. It 
should also be noted that the scoring 
scheme developed in the present study is 
intended to differentiate adult normals 
from adult neurotics and poorly adjusted
individuals and that its applicability to 
psychotics, organics, children, mental de- 
fectives, etc. is not known. There is no 
reason, however, why the method could 
not be extended to such nosological 
groups by appropriate item-analysis tech- 
niques. 


The hypothesis that differences in test 
performance between normals and neu- 
rotics would be accentuated in the ab- 
sence of an objective stimulus during 
recall is not given strong support by our 
findings. Only six discriminating signs 
could be isolated in the initial validation 
study, and the difference between the 
means of the criterion groups was con- 
siderably less than those obtained for 
initial and retest scores. When applied 
to the cross-validation groups, the dif- 
ferences on the recall test disappeared 
almost completely. If, however, a retest 
or a combined score is desired, it is neces- 
sary to administer the recall test, since 
the original data were obtained under 
these conditions. It seems likely that the 
retest and combined scores would not 
differ markedly from those obtained in 
the present investigation, even if the re- 
call test were omitted, but this is a prob- 
lem for future research, and until the 
results of such research are forthcoming, 
it is recommended that the present meth- 
od of administration be adhered to. If 
time is limited, the difficulty may be 
surmounted by giving only the initial 
test, for the results obtained under these 
conditions are in essential agreement 
with those found with the longer pro- 
cedure, except for the fact that greater 
confidence can be attached to the findings 
when initial, retest, and combined scores 
are consistent in their diagnostic impli- 
cations. 

With regard to the scoring method 
developed in the present study, the pre- 
diction that it would be successful in 
differentiating well-adjusted from poorly 
adjusted individuals is verified by the re- 
sults of the cross-validation study. The 
differences between the means of these 
contrasting groups are smaller than those 
obtained in the initial validation, but 
the differences emerge as statistically sig-
nificant despite the anticipated shrinkage
in validity. The fact that the scoring 
scheme can be applied to nonveterans 
as well as veterans, to females as well 
as males, and to individuals who are 
maladjusted but not necessarily neurotic, 
and the additional fact that differences 
in age (within the range of 18 to 50), 
education (within the high school and 
college levels), and intelligence (for IQ's
of 85 and above) do not appreciably in-
fluence the score indicate that the Ben-
der-Gestalt can serve a useful screening
function, not only in mental hygiene 
clinics but also in vocational guidance 
and psychological counseling agencies 
dealing with normal individuals and 
persons representing varying degrees of 
emotional disturbance short of psychosis. 
It is recommended as a screening device 
rather than as an instrument for elabo- 
rate personality interpretation because 
the differential incidence of most of the 
deviations is not great and because many 
of the signs said to have particularized 
meaning for neurotics prove to be non- 
discriminating. Interpretations of “flexi- 
bility” or “spontaneity” based on logical 
sequence, “deterioration” or “emotional 
immaturity” based on substitution of 
dashes or circles for dots, “depression” 
based on downward slope, “elation” 
based on upward slope, “instability” 
based on curvature irregularities, “neu- 
rotic sex conflict” based on closure or 
elongation or repetitive sketching of 
projections, “inferiority” or “insecurity” 
based on reduction in size, and “expan- 
siveness” or “oppositional traits” based 
on increase in size appear to be prema- 
ture in the light of the available evi- 
dence. Clinicians working almost exclu- 
sively with abnormal patients may estab- 
lish certain correlations between some of 
these signs and information gathered 
about the patient from behavioral, clini-
cal, or psychometric data, but it is sub-
mitted that they might well experience a
keen feeling of disillusionment at the ab- 
sence of such correlations in work with 
comparatively well-adjusted people. 

The cut-off score to be used for screen- 
ing purposes will depend upon the par- 
ticular purpose to be achieved. If it is 
desired to screen out a great proportion 
of the maladjusted individuals in a given 
sample, the critical score must be lowered 
despite the relatively high incidence of 
false positives. Reference to the distri- 
butions of initial, retest, and combined 
scores in Table 11 should prove helpful 
to the clinician in interpreting a given 
score in terms of the chances that such 
a score is likely to be made by an emo- 
tionally disturbed individual. These 
tables may readily be converted into ex- 
pectancy tables in personnel-selection 
situations where there is greater interest 
in group prediction than in individual 
predictions. For general purposes, it is 
suggested that cut-off scores be set at one 
standard deviation above the mean of 
the control groups in Table 11. Disre- 
garding fractions, these critical scores are 
7, 6, and 13 for the initial, retest, and
combined scores, respectively. On the 
initial test, a score above 7 screens out 
53.1 per cent of the maladjusted cases
at the cost of 14.8 per cent false positives; 
on the retest, a score above 6 screens out 
48.4 per cent at the cost of 16.7 per cent 
false positives; a combined score above 
13 picks up 50.0 per cent of the malad- 
justed cases at the expense of 11.1 per 
cent of the normals. 
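
The arithmetic behind these figures, and their conversion into an expectancy, can be sketched as follows. The case counts used here (34, 31, and 32 of the 64 maladjusted cases; 8, 9, and 6 of the 54 well-adjusted cases) are not reported in the text; they are back-computed from the stated percentages and should be treated as assumptions. The function names are likewise illustrative.

```python
def screening_rates(flagged_maladjusted, n_maladjusted,
                    flagged_normal, n_normal):
    """Return (pick-up, false positives) as percentages, one decimal place."""
    pick_up = 100 * flagged_maladjusted / n_maladjusted
    false_positives = 100 * flagged_normal / n_normal
    return round(pick_up, 1), round(false_positives, 1)


def chance_maladjusted(pick_up, false_positives, base_rate):
    """Probability that a case scoring above the cut-off is maladjusted,
    given the proportion of maladjusted cases in the group (Bayes' rule)."""
    hits = (pick_up / 100) * base_rate
    false_alarms = (false_positives / 100) * (1 - base_rate)
    return hits / (hits + false_alarms)


# Assumed counts of cases above each suggested cut-off (N = 64 and N = 54).
print(screening_rates(34, 64, 8, 54))   # initial score above 7
print(screening_rates(31, 64, 9, 54))   # retest score above 6
print(screening_rates(32, 64, 6, 54))   # combined score above 13

# Expectancy for the initial test in a group composed like the present one.
print(chance_maladjusted(53.1, 14.8, base_rate=64 / 118))
```

The expectancy depends heavily on the base rate. In an agency where maladjusted clients are a small minority, the same cut-off would yield a much lower proportion of true positives among the cases screened out.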

What can be said of the reliability of 
the scoring system developed in the pres- 
ent study? The concept of reliability is 
inherent in the method used to deter- 
mine item validity, since only those signs 
which differentiated in a consistent direc-
tion on both the initial test and the re-
test were retained for inclusion in the
final scoring system. Correlation of the 
initial and retest scores for the entire 
initial validation group of 393 cases re- 
sulted in a reliability coefficient of .67, 
while a similar correlation for the 82 
male subjects in the cross-validation 
study gave rise to a coefficient of .68. 
These figures are remarkably close to the 
test-retest coefficient of .71 reported by 
Pascal and Suttell for 44 subjects after 
an interval of 24 hours. The present 
writer takes the position that Bender- 
Gestalt test-retest correlation coefficients 
are not true reliability coefficients, but 
are underestimations of the true relia- 
bility. As is true on most performance 
tests—and the Bender-Gestalt is no excep- 
tion—a test administered a second time 
does not measure the same factors tapped 
by the initial administration. In the case 
of the Bender-Gestalt, the initial test 
constitutes an unfamiliar task, while the 
retest represents a relatively familiar situ- 
ation. Each succeeding design can be 
anticipated before it is actually pre- 
sented, and problems of size and spatial 
arrangement can be readily solved on the 
basis of prior experience. Memory factors, 
such as remembering the number of dots 
in figure 1 or the number of columns in 
figure 2, may obviate counting of the 
elements in the stimulus figures or reduce 
the time required to reproduce the fig-
ures. If the interval between test and re-
test is excessively long, these memory
factors will be reduced to a minimum, 
but changes in the subject’s adjustment 
or motivation may preclude consistency 
of measurement. Analysis of the present 
findings shows that there is better organi- 
zation and apportionment of space on 
the retest, resulting in more logical se- 
quence, that less time is required to re- 
produce the figures, that the reproduc- 
tions tend to be smaller, and that count- 
ing is less frequent on the second admin- 
istration of the test. It may be inferred 
from the above, although it cannot be 
demonstrated, that the subjects approach 
the retest with greater confidence and 
that the higher degree of structuration of 
the retest makes it a more pleasurable 
or less trying experience. The extent to 
which the intervening recall test influ- 
ences the test-retest correlation cannot be 
determined, but it probably is not great 
in view of the similarity of the present 
correlations with those reported by Pascal 
and Suttell, who did not employ the re- 
call situation. It is true, however, that 
our retest scores are systematically lower 
than initial test scores, whereas Pascal 
and Suttell find a negligible practice 
effect. However, they did not score for 
such factors as counting and time, both 
of which contribute markedly to a de- 
creased retest score. 
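
For readers who wish to reproduce a test-retest coefficient of this kind, a minimal sketch of the product-moment correlation follows. The five pairs of scores are invented for illustration and are not data from the study. The second function applies the Spearman-Brown formula for a doubled test, which would give a rough upper estimate of the reliability of a combined score only on the assumption that the two administrations are parallel measures. That assumption is one the present writer explicitly questions, so the result should be read as illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between paired score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown_doubled(r):
    """Reliability of the sum of two parallel measures that correlate r."""
    return 2 * r / (1 + r)


# Hypothetical initial and retest scores for five subjects.
initial = [5, 8, 3, 10, 6]
retest = [4, 6, 3, 7, 6]
print(pearson_r(initial, retest))

# Applied to the coefficient of .67 reported for the initial validation group.
print(spearman_brown_doubled(0.67))
```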


VII. SUMMARY AND
CONCLUSIONS 


The Bender-Gestalt Test was admin-
istered to 393 white male adult veterans
(108 neurotics and 285 controls) in an 
initial validation study and to 118 white 
nonveteran adults of both sexes (64 
poorly adjusted individuals and 54 con- 
trols) in a cross-validation study. The 
test was administered twice to each sub-
ject in conjunction with an interpolated
test of immediate recall. A total of 1,533
test records (511 initial tests, 511 retests, 
and 511 recall tests) was scored and ana- 
lyzed. An objective scoring system con- 
sisting of 82 general categories and 312 
specific signs was developed, and the in- 
cidence of each of these signs was deter- 
mined for each test record of the cases 
in the initial validation study. Thirty 
signs meeting specified criteria of statisti-
cal significance and consistency were iso-
lated and constitute the final scoring
system for the initial test and the retest.
An additional six signs were evolved for
the scoring of the recall test records. The
original test records were then rescored
with the selected signs and satisfactory
separations of the normal and neurotic
cases were achieved. The test records of
the subjects in the cross-validation study
were next scored with the same signs,
and statistically significant separations of
the poorly adjusted and well-adjusted
cases were obtained.

The results warrant the following con- 
clusions: 

1. Normals and neurotics are much more alike than
different with regard to perceptual-motor 
response to simple geometric designs, to 
the extent that such responses are meas- 
ured by reproduction of the Bender- 
Gestalt figures from a model or from 
memory. In this respect, the findings 
agree to a greater extent with those of 
Bender and Billingslea than with those 
of Hutt and Pascal and Suttell. 

2. Gross destructions of the Gestalt 
are quite rare in both normals and neu- 
rotics, except when the reproductions 
are drawn from memory, and even under 
such circumstances the incidence of dis- 
tortions is approximately the same for 
the two contrasting groups. If any sep- 
aration of normals from neurotics is to 
be effected, the evidence suggests that it 
must be achieved through consideration 
of more subtle modifications of the test 
figures.

3. Deviations of the reproductions 
from the stimuli occur more often in 
the records of the abnormal subjects, but 
certain deviations are more characteristic 
of well-adjusted individuals, suggesting 
that both types should be considered in 
evaluating test performance. 
4. None of the “abnormal” signs
occurs exclusively in the abnormal sub- 
jects, and none of the “normal” signs is 
found solely in the controls. 

5. General factors said to apply to all 
or almost all of the figures, such as re- 
duction in size, rotation, sketching, or 
angulation, are largely nonexistent, and 
any interpretation of test performance 
which assumes such factors is likely to 
be extravagant or misguided. The evi- 
dence supports the view that deviations 
are, for the most part, specific to specific 
figures. 

6. Constituting the final scoring sys-
tem evolved in this study are 30 signs
which discriminate consistently between 
normals and neurotics on the initial test 
and the retest. 

7. Neurotics and normals can be dis- 
tinguished on the basis of initial, retest, 
recall, or combined initial and retest 
scores, but the recall score is much less 
discriminating than the others and fails 
completely on cross validation. With re- 
gard to discriminating power for all the 
populations sampled, the combined score 
is somewhat superior to the initial score, 
which is in turn slightly more effective 
than the retest score. 

8. The expected shrinkage in validity 
is found in the cross-validation study, 
resulting in a greater degree of over- 
lapping of the distributions of the mal- 
adjusted and well-adjusted cases, but dif- 
ferences between the means of the two 
contrasting groups, significant at the 1 
per cent level, are nevertheless obtained 
for the initial, retest, and combined 
scores.

9. Retest reliability (initial vs. retest
scores) is represented by correlations of
.67 and .68 for the original and cross-
validation groups, but these coefficients
are regarded as underestimations of the
true reliability on the grounds that a
second administration of the test does
not duplicate the first administration. 
The chief reason for this lack of replica- 
tion is that the retest is less amorphously 
structured for the subject because of in- 
creased familiarity with the requirements 
of the test situation. Though not demon- 
strated, it is highly probable that the 
reliability of the combined score is 
higher than that of the initial test or 
retest alone. 

10. The scoring method developed in 
this investigation is not significantly re- 
lated to sex, age (within the range of 18 
to 50), education (for persons with eighth- 
grade schooling or better), or intelli- 
gence (for IQ’s of 85 and above). 

11. Hutt’s “neurotic syndrome” is in 
general not substantiated by the pres- 
ent findings. Many of Hutt’s psycho- 
dynamic interpretations are also open 
to question in view of the fact that a 
number of the deviations said to be
indicative of emotional or motivational 
processes occur equally often among 
normals and neurotics. 

12. The scoring method developed by 
Pascal and Suttell fails to differentiate 
the 138 normal and neurotic subjects of 
the matched criterion groups. Contrary 
to their findings, there is no significant 
relationship between education and test 
performance for adults, and this state- 
ment holds true whether Pascal and 
Suttell’s scoring system or the one 
evolved in the present investigation is 
employed. 

13. The Bender-Gestalt Test, as scored 
in the present study, is recommended as 
a screening device to be used as a sup- 
plement to other psychodiagnostic tests 
rather than as an instrument for the 
elaborate interpretation of individual 
personality dynamics. 
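
The critical ratios on which conclusion 8 rests are differences between means divided by the standard error of the difference. A minimal sketch of that computation follows. The means and group sizes are those given for the TAC total groups on the initial test, but the standard deviations (2.5 and 3.0) are hypothetical placeholders, since the standard deviations of the scores are not available here. The 2.58 criterion is the conventional two-tailed normal deviate for the 1 per cent level.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error."""
    se_diff = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


def significant_at_1_per_cent(cr):
    """Two-tailed test against the normal deviate of 2.58."""
    return abs(cr) >= 2.58


# Means and Ns for the initial test; the SDs are assumed values only.
cr = critical_ratio(7.42, 3.0, 64, 4.98, 2.5, 54)
print(cr, significant_at_1_per_cent(cr))
```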